Further observations on the relationship between the number of 
ovules formed and the number of seeds developing in Cercis 

J. Arthur Harris 
(with four text figures) 

I. Introductory Remarks 
In an earlier paper,* I discussed, on the basis of large masses of data 
drawn from series of trees in different habitats, the relationship 
between the number of seeds maturing and the number of ovules 
formed in the legume, Cercis canadensis. The analysis of the 
more homogeneous collections from individual trees was then 
reserved. This is now undertaken. 

II. Analysis of Data 
A. The Meramec Highlands Collections 

For convenience of treatment merely, I recognize two series, 
the first comprising 10 trees from which a relatively large number 
of pods were taken and the second embracing 100 trees from 
which 50-100 pods each were gathered.† 

Table I gives the data and Table II the essential constants 
for the 13 large samples. 

That the samples differ from tree to tree is especially con- 
spicuous in the averages. Means such as 3.56, 3.75, 3.88, 4.26, 

* Harris, J. Arthur. On the relationship between the number of ovules formed 
and the number of seeds developing in Cercis. Bull. Torrey Club 41 : 243-256. 1914. 

† From three trees of this latter collection a much larger number of pods was 
taken; they are, therefore, included here. The first hundred pods are also treated 
below, where the small samples from individual trees are discussed. 

[The Bulletin for October (41: 483-532) was issued 28 O 1914.] 
TABLE I 
Ovules and Seeds Developing per Pod 

TABLE II 
Physical Constants for Thirteen Individuals 

TABLE III 
Differences and Probable Errors of Differences in the Standard Deviation 
of Ovules and Seeds and in the Correlation of Ovules and Seeds 
for Selected Individuals of Cercis 

TABLE IV 
Relationship between Number of Ovules per Pod and Number of Seeds per Pod 

4.76, 5.13, 5.54 and 6.15 with probable errors always in the 
second place of decimals and ranging from .016 to .057 for ovules 
are so clearly significantly different that it is needless to calculate 
probable errors. The differences for mean number of seeds are 
also clearly significant. 

The standard deviations for both ovules and seeds present a 
problem more difficult of solution by mere inspection. Values 
like .63, .68, .71, .74, .78, .85 and .93 with probable errors not 
exceeding .04 in any case appear to be significant. Differences 
exceeding their probable error as widely as those for the few 
random pairs given in Table III certainly indicate that the trees 
must be regarded as individual in variability as well as type. 

For both ovules and seeds the coefficients of variation differ 
conspicuously, ranging from 13.12 to 22.12 for ovules and from 
20.37 to 34.87 for seeds. Unless the values of means and standard 
deviations are perfectly correlated — an assumption which we 
have no reason to make — one would expect the coefficients of 
variation to show considerable fluctuation in magnitude from 
the influence of the means alone. 

Since the constants showing the mean values and the vari- 
abilities of ovules and seeds per pod and perhaps those for the 
correlation of these two characters as well (see Table III) differ 
significantly from individual to individual it is clear that there 
may be serious disadvantages in lumping together the materials 
from different trees to form a large sample to be used in the 
investigation of delicate correlations. 

Turn now to the correlation coefficients. Table IV gives the 
results. Here as in the two random samples and their combi- 
nation there is a considerable correlation between the number of 
ovules per pod and the actual number of seeds developing. 

Trees 1, 5, 7, 8, and 9 have been selected for the test of line- 
arity of regression because of the large number of pods available. 
The equations to the regression straight lines are given in the 





Tree    Equation to Regression Straight Line    Deviation* 
1       s = .284 + .705 o                       .065 
5       s = .653 + .657 o                       .042 
7       s = .652 + .658 o                       .050 
8       s = .194 + .781 o                       .030 
9       s = .100 + .754 o                       .048 


accompanying table. The average weighted deviation of obser- 
vation from theory of each tree is also given. Graphs have been 
made showing the empirical means and the fitted straight line 
for each tree. Only that for tree 9 need be given, as figure 1. 
Perhaps the fit is slightly better than for some of the others, but 
the observations are numerous. All of the graphs show unusually 
good agreements of theoretical and empirical means. The 
theoretical and the empirical means differ on an average by only 
about five-hundredths of a seed. 
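
As a rough illustration of how the fitted lines are read, the following sketch evaluates the five regression equations for pods with 3 to 7 ovules. The intercepts and slopes are those printed for trees 1, 5, 7, 8, and 9; the range of ovule counts and the helper name `predicted_seeds` are assumptions made only for the example.

```python
# Regression constants (intercept a, slope b) as printed for each tree,
# for the line s = a + b * o (s = mean seeds per pod, o = ovules per pod).
REGRESSION = {
    1: (0.284, 0.705),
    5: (0.653, 0.657),
    7: (0.652, 0.658),
    8: (0.194, 0.781),
    9: (0.100, 0.754),
}


def predicted_seeds(tree: int, ovules: int) -> float:
    """Theoretical mean number of seeds for pods with the given ovule count."""
    a, b = REGRESSION[tree]
    return a + b * ovules


if __name__ == "__main__":
    # The ovule range 3..7 is illustrative only.
    for tree in REGRESSION:
        row = ", ".join(f"{predicted_seeds(tree, o):.2f}" for o in range(3, 8))
        print(f"tree {tree}: {row}")
```

The "deviation" column of the table would then be the pod-weighted mean of the absolute differences between such theoretical values and the observed class means, which requires the correlation tables not reproduced here.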

* Weighted mean deviation, disregarding signs, of observed from theoretical 
mean number of seeds per pod. 
The coefficients for the correlation between the number of 
ovules per pod and the capacity of the pods for developing their 
ovules into matured seeds as determined from the formula 

$$r_{oz} = \frac{r_{os} - V_o/V_s}{\sqrt{1 - r_{os}^2 + (r_{os} - V_o/V_s)^2}}$$



are all very low and are either positive or negative in sign, for 
these individual trees — presumably more homogeneous than the 
Fig. 1. Regression line and empirical means for tree 9. 

general samples. Regarding probable errors, only four of the 
thirteen can be considered to differ significantly from 0. Ten 
of the thirteen are negative. All those which are significant 
with regard to their probable error are negative in sign. 
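
A minimal sketch of this calculation follows. It assumes the formula has the form r_oz = (r_os - V_o/V_s) / sqrt(1 - r_os^2 + (r_os - V_o/V_s)^2), where V_o and V_s are the coefficients of variation of ovules and of seeds per pod; the function name and the numerical inputs are hypothetical and are not taken from the tables.

```python
import math


def r_oz(r_os: float, v_o: float, v_s: float) -> float:
    """Correlation between ovules per pod and the deviation of seeds per pod
    from their probable number, given the ovule-seed correlation r_os and the
    coefficients of variation of ovules (v_o) and seeds (v_s).

    Assumed form: (r_os - v_o/v_s) / sqrt(1 - r_os^2 + (r_os - v_o/v_s)^2).
    """
    d = r_os - v_o / v_s
    return d / math.sqrt(1.0 - r_os ** 2 + d ** 2)


if __name__ == "__main__":
    # Hypothetical inputs within the ranges of variation quoted in the text.
    print(round(r_oz(0.60, 15.0, 25.0), 4))  # ratio V_o/V_s equal to r_os
    print(round(r_oz(0.60, 18.0, 25.0), 4))  # ratio V_o/V_s larger than r_os
```

Under this form the sign of the coefficient depends only on whether r_os exceeds the ratio V_o/V_s, which is why small differences between trees can turn the coefficient positive or negative.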

Smaller collections from 100 trees were gathered shortly after 
those described above for the purpose of having more numerous 
individual samples for testing the results already stated. Separate 
correlation tables showing the relationship between the number 
of ovules formed and the number of seeds developing were drawn 
up for each tree. Of course it is impossible to publish all these. 

The constants which interest us here are (a) the coefficient of 
correlation for the number of ovules formed and the number of 
seeds developing per pod, and (b) the coefficient of correlation 
between the number of ovules per pod and the deviation of the 
number of seeds per pod from the probable number on the assump- 
tion that the number of seeds per pod is proportional to the 
number of ovules. These constants are presented in Table V in 
a form which will be clear without further comment. 

The values for all the trees (large and small collections) are 
seriated in Table VI. 

At the outset of the work I had thought that perhaps by the 
collection of a sufficiently wide series of individuals I might find 
some in which the coefficient of correlation between the number 
of ovules formed and the number of seeds developing would be 
very low. The results for the 110 trees show, however, that 
not only are all the values positive but that in every case they 
are of a substantial order. The lowest value entered in Table VI 
is .325 and the highest .850. 

The constants of these correlation coefficients calculated with 
Sheppard's correction, are: 

Mean, .5994 ± .0075 

Standard Deviation, .1165 ± .0053 

Coefficient of Variation, 19.431 
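
The probable errors attached to these constants can be checked with the usual formulas, .67449 SD/sqrt(N) for a mean and .67449 SD/sqrt(2N) for a standard deviation. The sketch below assumes a mean of .5994, a standard deviation of .1165, and N = 110 trees; the function names are illustrative only.

```python
import math

PE_FACTOR = 0.67449  # converts a standard error into a probable error


def pe_of_mean(sd: float, n: int) -> float:
    """Probable error of a mean based on n values."""
    return PE_FACTOR * sd / math.sqrt(n)


def pe_of_sd(sd: float, n: int) -> float:
    """Probable error of a standard deviation based on n values."""
    return PE_FACTOR * sd / math.sqrt(2 * n)


def coefficient_of_variation(mean: float, sd: float) -> float:
    """Standard deviation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


if __name__ == "__main__":
    mean, sd, n_trees = 0.5994, 0.1165, 110  # n_trees is an assumption
    print(f"PE of mean: {pe_of_mean(sd, n_trees):.4f}")
    print(f"PE of SD:   {pe_of_sd(sd, n_trees):.4f}")
    print(f"CV:         {coefficient_of_variation(mean, sd):.2f}")
```

The coefficient of variation computed from the rounded constants may differ in the last decimal from the printed figure, presumably because the original calculation carried more places.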

It seems of considerable interest for our present problem to 
try to ascertain whether this observed variation in the magnitudes 
of the correlations between the number of ovules and seeds 
represents a real biological difference in the individual trees 
examined, or whether it is merely a statistical result, attributable 
to the fact that constants for each tree are based upon a small, 
not immensely large, sample of its pods. I believe we may 
make some progress as follows. 

The standard deviation of random sampling of $r$ is $(1 - r^2)/\sqrt{n}$, 
where n is the number of individuals included in the sample on 
which r is calculated. Let us assume now that the true value of r 
is the same for each individual tree and that it has the value 
found as our mean, say r = .600. This seems reasonable. 

If n for each individual were 100, a condition holding for 60 
of the trees but only with rough approximation for the remaining 
50, one would expect 
TABLE V 
Correlations for Series of Pods from Individual Trees 

TABLE VI 
Distribution of Coefficients of Correlation, $r_{os}$ 

$$\sigma_r = \frac{1 - .60^2}{\sqrt{100}} = .0640.$$



This would have its probable error 

$$.67449 \times .0640/\sqrt{220} = .0029.$$* 

By comparison we get: 

Empirical, S. D. = .1165 ± .0053, 

Theoretical, S. D. = .0640 ± .0029, 

Difference = .0525 ± .0060. 
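
The comparison can be retraced step by step. The sketch below assumes r = .60, n = 100 pods per tree, and 110 trees, and combines the two probable errors as the root of the sum of their squares; the function names are illustrative only.

```python
import math

PE_FACTOR = 0.67449  # converts a standard error into a probable error


def sd_of_r(r: float, n_pods: int) -> float:
    """Standard deviation of random sampling of r: (1 - r^2) / sqrt(n)."""
    return (1.0 - r ** 2) / math.sqrt(n_pods)


def pe_of_sd(sd: float, n_trees: int) -> float:
    """Probable error of a standard deviation based on n_trees values."""
    return PE_FACTOR * sd / math.sqrt(2 * n_trees)


def pe_of_difference(pe_a: float, pe_b: float) -> float:
    """Probable error of a difference of two independent constants."""
    return math.hypot(pe_a, pe_b)


if __name__ == "__main__":
    theoretical_sd = sd_of_r(0.60, 100)
    theoretical_pe = pe_of_sd(theoretical_sd, 110)
    empirical_sd, empirical_pe = 0.1165, 0.0053  # constants quoted in the text

    difference = empirical_sd - theoretical_sd
    difference_pe = pe_of_difference(empirical_pe, theoretical_pe)

    print(f"theoretical SD: {theoretical_sd:.4f} +/- {theoretical_pe:.4f}")
    print(f"difference:     {difference:.4f} +/- {difference_pe:.4f}")
    print(f"ratio to PE:    {difference / difference_pe:.2f}")
```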

The difference is over 8 times its probable error and I think indi- 
cates that there are real biological differences in the individuals. 
The constants for the correlation between the number of ovules 
per pod and the deviation of the number of seeds from their 
probable value, $r_{oz}$, for the 60 trees with 100 pods per tree and 
for the 40 other trees with less than 100 pods are summarized 
from Table V in Table VII. For a grand total of 110 trees I add 

TABLE VII 
Frequencies of Values of $r_{oz}$ for Series of Individuals 


the constants obtained for the 10 individuals from which all the 
pods were taken. 

These results are also shown graphically in Fig. 2. Here the 
length of the lines indicates the magnitude of the correlation, 
i. e., the amount by which it deviates from 0, and the nature and 



* The 10 trees yielding more than 100 pods each have been included. 
the direction of this line, the sign. The firm lines and solid dots 
below the zero bar stand for negative coefficients, while the 
broken lines and circles above the zero bar indicate the magnitude 
of positive coefficients. The ten large trees are omitted. 

Taking these results as they stand it would appear that the 
relationship between the number of ovules per pod and the 
capacity of the pod for maturing its seeds may vary, being some- 
times positive and sometimes negative. The light and dark line 
areas in the diagram are nearly equal, and their means lie about 
equally distant from the zero bar. 

But one cannot accept any statistical constant (average, 
standard deviation, or coefficient of correlation) as absolutely 
correct as a description of the material from which the sample 
investigated was drawn; all are too large or too small by an 
amount known to mathematicians as the probable error of random 
sampling. Is it not possible that the considerable range of vari- 
ation in the constants tabled here is due to purely statistical 
causes and has no biological significance whatever? I think we 
may proceed as follows. 

Assuming that there is no relationship between the number of 
ovules per pod and the capacity of the pod for maturing its seeds, 
i. e., that $r_{oz} = 0$, the deviation of the empirical means and the 
empirical standard deviation due to the probable errors of random 
sampling from the theoretical 0 may be determined. I illustrate 
with the 60 trees furnishing each 100 pods. 

The standard deviation of the coefficient of correlation is, as 
pointed out above, $(1 - r^2)/\sqrt{n}$. Clearly where $n = 100$ and $r_{oz}$ 
is actually 0 one would expect a standard deviation of 0.10 due 
merely to the errors of random sampling. But this standard 
deviation itself would have a probable error which for the 60 
trees from which I have 100 pods would be $.67449(0.1/\sqrt{120})$, or 
.0062. 

One must expect, therefore, if $r_{oz}$ be actually 0, to find a 
standard deviation of 0.1000 ± .0062 in the coefficients for the 
60 trees with 100 pods each due to no organic cause whatever but 
solely to the errors of random sampling. 

The probable error of the mean is $.67449\sigma/\sqrt{n}$. Substituting 
values with $\sigma = 0.1$ as just indicated I find 

$$E_m = .67449 \times 0.1/\sqrt{60} = .0087.$$
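
The same arithmetic applies to any number of trees. The following sketch, under the stated assumption of a true correlation of 0 and n = 100 pods per tree, returns the expected standard deviation of the coefficients together with the probable errors of that standard deviation and of the mean; the function name `null_constants` is hypothetical.

```python
import math

PE_FACTOR = 0.67449  # converts a standard error into a probable error


def null_constants(n_pods: int, n_trees: int) -> tuple[float, float, float]:
    """Constants expected from random sampling alone when the true r is 0.

    Returns (sd, pe_sd, pe_mean):
      sd      = (1 - 0^2) / sqrt(n_pods), the expected SD of the coefficients
      pe_sd   = probable error of that SD for n_trees coefficients
      pe_mean = probable error of the mean of n_trees coefficients
    """
    sd = 1.0 / math.sqrt(n_pods)
    pe_sd = PE_FACTOR * sd / math.sqrt(2 * n_trees)
    pe_mean = PE_FACTOR * sd / math.sqrt(n_trees)
    return sd, pe_sd, pe_mean


if __name__ == "__main__":
    for n_trees in (60, 40, 110):
        sd, pe_sd, pe_mean = null_constants(100, n_trees)
        print(f"{n_trees:>3} trees: SD {sd:.4f} +/- {pe_sd:.4f}, "
              f"mean 0 +/- {pe_mean:.4f}")
```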

This line of argument has been followed out in the preparation 
of Table VIII where for convenience I assumed that n = 100 for 
each tree.* In this table, the constants actually found are com- 
pared with the values one should expect them to have if the corre- 

TABLE VIII 
Mean Values of $r_{oz}$ for the Individual Trees 

                                  60 Trees          40 Trees          110 Trees 
Theoretical mean                  .0000 ± .0087     .0000 ± .0107     .0000 ± .0064 
Calculated mean                  -.0292 ± .0106    -.0175 ± .0154    -.0232 ± .0080 
Difference                       -.0292 ± .0137    -.0175 ± .0187    -.0232 ± .0102 
Theoretical standard deviation    .1000 ± .0062     .1000 ± .0075     .1000 ± .0045 
Calculated standard deviation     .1218 ± .0075     .1440 ± .0108     .1242 ± .0056 
Difference                        .0218 ± .0097     .0440 ± .0131 

lation between the number of ovules per pod and their capacity 
for maturing their seed were actually o, and the constants as 
found from actual collections were due merely to the errors of 
random sampling. 

For all three series the mean differs from 0 by less than 2.5 
times its probable error. For the standard deviations the difference 
between the observed and the theoretical values is less than 
2.5 times its probable error in the case of the 60 trees and only 
about 3.3 times its probable error in the other two series. 
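
These ratios can be recomputed from the entries of Table VIII. The sketch below takes the calculated and theoretical constants as read from the table and assumes that the probable error of each difference is the root of the sum of the squares of the two probable errors; the dictionary layout and function names are illustrative only.

```python
import math

THEORETICAL_SD = 0.1000  # expected SD of r_oz when n = 100 and the true r_oz is 0

# Per series: calculated mean and SD of r_oz with their probable errors,
# and the probable errors of the theoretical mean (pe_mean0) and SD (pe_sd0).
SERIES = {
    "60 trees": dict(mean=-0.0292, pe_mean=0.0106, pe_mean0=0.0087,
                     sd=0.1218, pe_sd=0.0075, pe_sd0=0.0062),
    "40 trees": dict(mean=-0.0175, pe_mean=0.0154, pe_mean0=0.0107,
                     sd=0.1440, pe_sd=0.0108, pe_sd0=0.0075),
    "110 trees": dict(mean=-0.0232, pe_mean=0.0080, pe_mean0=0.0064,
                      sd=0.1242, pe_sd=0.0056, pe_sd0=0.0045),
}


def ratio(difference: float, pe_a: float, pe_b: float) -> float:
    """Absolute difference divided by the probable error of the difference."""
    return abs(difference) / math.hypot(pe_a, pe_b)


def compare(row: dict) -> tuple[float, float]:
    """Ratios for the mean (against 0) and for the SD (against THEORETICAL_SD)."""
    mean_ratio = ratio(row["mean"], row["pe_mean"], row["pe_mean0"])
    sd_ratio = ratio(row["sd"] - THEORETICAL_SD, row["pe_sd"], row["pe_sd0"])
    return mean_ratio, sd_ratio


if __name__ == "__main__":
    for name, row in SERIES.items():
        mean_ratio, sd_ratio = compare(row)
        print(f"{name}: mean ratio {mean_ratio:.2f}, SD ratio {sd_ratio:.2f}")
```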

Notwithstanding these low values of the means and the nearly 
equal areas of plus and minus values on the diagram, one must not 
lose sight of the facts, (a) that there are more negative than 
positive coefficients, the ratio being 57 : 43 for the smaller col- 
lections and 64 : 46 for the series from all the individual trees, 
and (b) that all the means are negative in sign. 

This series like those discussed above seems to indicate that if 
there is any relationship between the number of ovules per ovary 
and the capacity of the ovary for maturing its ovules into seeds 
it is of such a nature that the ovaries with the larger numbers of 
ovules are slightly less capable than those with the smaller 
numbers. 

* This is not strictly true; I think that the approximation is quite close enough 
for present purposes. Sheppard's correction was used for the empirical distribution. 
B. The Individuals from the Vicinity of Lawrence, Kansas 

The relationship for the whole material has already been 
discussed. For the individual trees the results as shown in 
Table IX are most interesting. Of the 22 values of $r_{oz}$, only 1 has 

TABLE IX 
Correlations for Individuals 

Tree    $r_{os}$ and $E_{r_{os}}$    $r_{oz}$ and $E_{r_{oz}}$ 
1       .671 ± .037                  -.134 ± .066 
2       .692 ± .035                  +.099 ± .099 
3       .595 ± .044                  -.052 ± .067 
4       .604 ± .043                  -.204 ± .065 
5       .580 ± .045                  -.172 ± .065 
6       .549 ± .047                  -.131 ± .066 
7       .400 ± .057                  -.182 ± .065 
8       .380 ± .058                  -.189 ± .065 
9       .365 ± .058                  -.341 ± .060 
10      .365 ± .058                  -.319 ± .061 
11      .554 ± .047                  -.174 ± .065 
12      .394 ± .057                  -.189 ± .065 
13      .609 ± .042                  -.030 ± .067 
14      .391 ± .057                  -.263 ± .063 
15      .381 ± .058                  -.274 ± .062 
16      .543 ± .048                  -.117 ± .067 
17      .537 ± .048                  -.253 ± .063 
18      .181 ± .065                  -.322 ± .060 
19      .572 ± .045                  -.022 ± .067 
20      .644 ± .039                  -.212 ± .064 
21      .514 ± .050                  -.085 ± .067 
22      .635 ± .040                  -.288 ± .062 


the positive sign. Taking the ratios of the constants to their 
probable errors to test their significance I find that the single 
positive constant deviates from 0 by an amount equal to its 
probable error; hence no significance can be attached to it. Of 
the 21 negative coefficients 14 differ from 0 by more than 2.5 
times their probable error. The whole situation is summed up 
graphically in diagram 3 in which these ratios are plotted out 
with the signs of the correlations. If there were no real biological 
correlation between the number of ovules per pod and their capacity 
for maturing their seeds, the distribution of these ratios 
would be centered at 0, marked by the "theoretical mode and 
mean" line, with the deviations about equally distributed above 
and below. What one does find is that the empirical mode falls 
on the class -3.50 to -2.50. The area of the polygon which 
is composed of statistically significant constants is shaded in. 




Fig. 3. Distribution of ratios of $r_{oz}$ coefficients to their probable errors, showing 
large number of significantly negative correlations for the eastern Kansas series. 
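
The counts quoted for the Kansas series can be retraced from the second column of Table IX. The sketch below uses the 22 values of r_oz and their probable errors as read from that table, takes 2.5 times the probable error as the criterion of significance, and groups the ratios into unit classes centered on whole numbers (so that the class -3.50 to -2.50 is labeled -3); the function names are illustrative only.

```python
import math
from collections import Counter

# (r_oz, probable error) for the 22 Kansas trees, as read from Table IX.
R_OZ = [
    (-0.134, 0.066), (0.099, 0.099), (-0.052, 0.067), (-0.204, 0.065),
    (-0.172, 0.065), (-0.131, 0.066), (-0.182, 0.065), (-0.189, 0.065),
    (-0.341, 0.060), (-0.319, 0.061), (-0.174, 0.065), (-0.189, 0.065),
    (-0.030, 0.067), (-0.263, 0.063), (-0.274, 0.062), (-0.117, 0.067),
    (-0.253, 0.063), (-0.322, 0.060), (-0.022, 0.067), (-0.212, 0.064),
    (-0.085, 0.067), (-0.288, 0.062),
]


def ratios(pairs):
    """Ratio of each coefficient to its probable error, sign retained."""
    return [r / pe for r, pe in pairs]


def summarize(pairs, threshold=2.5):
    """Counts of signs and of coefficients exceeding threshold * PE."""
    rs = ratios(pairs)
    return {
        "negative": sum(r < 0 for r in rs),
        "positive": sum(r > 0 for r in rs),
        "significant_negative": sum(r < -threshold for r in rs),
        "significant_positive": sum(r > threshold for r in rs),
    }


def class_frequencies(pairs):
    """Frequencies of ratios in unit classes centered on whole numbers."""
    return Counter(math.floor(r + 0.5) for r in ratios(pairs))


if __name__ == "__main__":
    print(summarize(R_OZ))
    freq = class_frequencies(R_OZ)
    for center in sorted(freq):
        print(f"{center - 0.5:+.2f} to {center + 0.5:+.2f}: {freq[center]}")
```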



C. The Individuals from the Vicinity of Sharpsburg, Ohio 

In this collection, the lumped data gave values of $r_{os}$ = .455 
± .009, $r_{oz}$ = -.034 ± .011. Thus the correlation between 
number of ovules and the capacity of the ovary for maturing its 
ovules into seeds while negative in sign is not only low but is 
only 3.15 times its probable error. 

The calculated correlations for the individuals are set forth 
in Table X. 

These results differ essentially from those secured for the 
Kansas series in two regards; (a) only 6 out of the 26 may be 
considered statistically significant with regard to their probable 
error as compared with 14 out of 21 in the Kansas series; (b) the 
constants are about evenly distributed between positive and 
negative, there being 14 positive and 12 negative signs. The 
general average is, however, negative. 

Again, the ratios of the constants to their probable errors are 
plotted out in a polygon (fig. 4) showing the scatter of the con- 
stants on either side of zero. 
TABLE X 
Correlations for Individuals 

Tree    $r_{os}$ and $E_{r_{os}}$    $r_{oz}$ and $E_{r_{oz}}$ 
1       .511 ± .041                  -.079 ± .055 
2       .790 ± .021                  +.078 ± .055 
3       .566 ± .037                  +.022 ± .055 
4       .614 ± .034                  +.205 ± .055 
5       .310 ± .050                  -.066 ± .055 
6       .581 ± .036                  +.184 ± .055 
7       .396 ± .046                  +.083 ± .055 
8       .426 ± .045                  -.057 ± .055 
9       .254 ± .051                  +.020 ± .055 
10      .425 ± .045                  -.058 ± .055 
11      .324 ± .049                  -.060 ± .055 
12      .485 ± .042                  +.056 ± .055 
13      .614 ± .034                  -.074 ± .055 
14      .119 ± .054                  -.164 ± .054 
15      .524 ± .040                  +.047 ± .055 
16      .567 ± .037                  +.108 ± .054 
17      .400 ± .046                  -.280 ± .051 
18      .454 ± .044                  +.071 ± .055 
19      .440 ± .044                  +.041 ± .055 
20      .342 ± .049                  +.137 ± .054 
21      .338 ± .049                  -.183 ± .053 
22      .470 ± .043                  -.062 ± .055 
23      .614 ± .034                  -.075 ± .055 
24      .416 ± .045                  -.064 ± .055 
25      .386 ± .047                  -.089 ± .055 
26      .262 ± .051                  -.086 ± .055 

Fig. 4. Ratios of $r_{oz}$ to $E_{r_{oz}}$, Ohio series. 
III. Summary and Discussion 
The investigations described in this and the preceding paper 
establish several points concerning fertility and fecundity. The 
following may now be cited. 

(a) The physical constants — type, variability, and corre- 
lation — of the number of ovules per pod and the number of seeds 
developing per pod in Cercis canadensis differ sensibly from 
individual to individual and from habitat to habitat. The data 
do not, however, justify the conclusion that the trees from the 
different habitats are to be distinguished taxonomically. 

(b) The correlations for number of ovules formed and number 
of seeds developing per pod, $r_{os}$, have always been found positive 
and of a moderate, considerable or even high intensity. 

This is true for the pods of an individual tree as well as for a 
mixed sample from a considerable series of trees. The correlation 
coefficient is slightly raised by the combination of collections from 
different individuals. 

(c) Regression is sensibly linear, both within the series of 
pods from the same individual and in a population of pods from 
many individuals. Possibly, however, there is a departure from 
linearity in the pods with eight ovules, but in my largest series 
there are only 36 of these pods out of a total of 28,554; this number 
is too small to be given great importance. 

The significance of the linearity of regression is two-fold. 
Statistically, it justifies describing the interdependence between 
the number of ovules formed and the number of seeds maturing 
by the coefficient of correlation. Biologically, it shows that the 
rate of increase in number of seeds developing per pod remains 
the same as we pass from pods with the lowest to pods with the 
highest numbers of ovules. 

(d) Wherever large series of pods have been examined, the 
correlation between the number of ovules per pod and the capacity 
of the pods for maturing their seeds, $r_{oz}$, has a negative sign and a 
low, usually a very low, magnitude. When the number of pods 
is relatively small — say about 100 as in the case of the correlations 
from individual trees — the coefficient is sometimes positive. 
These results may well be due to the probable errors of random 
sampling which, with samples of this small size, may be quite 
large enough to screen such a slight relationship. 
In such cases the number of negative values is generally larger 
than the number of positive coefficients, and their mean numerical 
magnitude is always higher. For every large series examined the 
value of $r_{oz}$ has been over 2.5 times its probable error and 
sometimes many times its probable error. These evidences can leave 
little doubt of the existence of a slight negative relationship 
between the number of ovules formed and the capacity of the 
pod for maturing its ovules into seeds, the pods with the larger 
number of ovules producing relatively fewer seeds. 

This conclusion has also been reached in an earlier paper for 
the dwarf varieties of Phaseolus vulgaris as a whole. 

(e) The foregoing conclusions and other statements made in 
these pages apply exclusively to the one species considered and 
should not be extended to others except on the basis of actual 
data. There is no reason to assume that species may not differ 
in this regard. The data available for another of the Leguminosae, 
Robinia,* indicate that quite different conditions from those found 
in Cercis may prevail. If the correlations found for Sanguinaria† 
are based on sufficiently large and representative samples they 
lead to the same conclusion. There are strong evidences that 
some strains of Phaseolus differ from others in the sign of this 
relationship. Indeed the Kansas series of Cercis differs rather 
conspicuously from others in the intensity of the negative corre- 
lation. 

The conclusions concerning capacity for seed development 
here drawn are based upon mature pods only. One of the most 
important things to be done is to determine the relation of this 
phenomenon to the intra-individual selective elimination of ovaries, 
if it occurs in Cercis. All of the data here discussed were col- 
lected before this differential failure of ovaries in Staphylea was 
demonstrated. As yet I have been unable to obtain adequate 
materials for solving the problem for Cercis. 

(f) This paper is exclusively a statement of observed facts. 
I have no explanation to offer of the relationships which have been 
regularly found when adequately large series of data have been 
analyzed. Theories as to the causes underlying the conditions 

* Harris, J. Arthur. Biometrika 6: 441-442. 1909. 
† Harris, J. Arthur. Biometrika 7: 321-324. 1910. 
observed seem to me, in view of the numerous difficulties of the 
problem, premature. Upon the painstaking collections of facts 
in regard to natural phenomena, whether or not they can be 
lined up with current theories, seems to me to rest the real advance 
of biology. When more comprehensive data are available — many 
of which are already collected and in an advanced stage of reduc- 
tion — it will be much safer to consider causal phases of the phe- 
nomena. 

Cold Spring Harbor 



